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Systems 


Ramesh  Neelamani 

Abstract 

This  thesis  proposes  a  new  approach  to  wavelet-based  image  deconvolution  that  com¬ 
prises  Fourier-domain  system  inversion  followed  by  wavelet-domain  noise  suppression. 
In  contrast  to  other  wavelet-based  deconvolution  approaches,  the  algorithm  employs 
a  regularized  inverse  filter ,  which  allows  it  to  operate  even  when  the  system  is  non- 
invertible.  Using  a  mean-square-error  metric,  we  strike  an  optimal  balance  between 
Fourier-domain  regularization  that  is  matched  to  the  system  and  wavelet-domain 
regularization  that  is  matched  to  the  input  signal.  The  resultant  algorithm  is  fast, 
O  (A  log*  IV)  where  N  denotes  the  number  of  samples,  and  is  well-suited  to  signals 
and  images  with  spatially-localized  phenomena  such  as  edges.  In  addition  to  enjoy¬ 
ing  asymptotically  optimal  rates  of  error  decay  for  some  systems,  the  algorithm  also 
achieves  excellent  performance  at  fixed  data  lengths.  In  simulations  with  real  data, 
the  algorithm  outperforms  the  conventional  LTI  Wiener  filter  and  other  wavelet-based 
deconvolution  algorithms  in  terms  of  both  visual  quality  and  MSE  performance. 
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Chapter  1 
Introduction 

Deconvolution  is  a  recurring  theme  in  a  wide  variety  of  signal  and  image  processing 
problems,  from  channel  equalization  [1]  to  image  restoration  [2],  The  importance  of 
deconvolution  can  be  grasped  with  a  real-life  example  from  image  restoration.  Satel¬ 
lite  images  obtained  in  practice  are  often  blurred  due  to  limitations  such  as  aperture 
effects  of  the  camera,  camera  motion  and  atmospheric  turbulence.  Deconvolution  be¬ 
comes  necessary  to  get  a  crisp,  deblurred  image  that  is  needed  for  viewing  or  further 
processing. 

In  its  simplest  form,  the  1-d  deconvolution  problem  runs  as  follows.  The  desired 
signal  x  is  input  to  a  known  linear  time-invariant  (LTI)  system  %  having  impulse 
response  h.  Independent  identically  distributed  (i.i.d.)  samples  of  Gaussian  noise  7 
with  variance  a2  corrupt  the  output  samples  of  the  system  %.  The  observations  at 
discrete  points  tn  are  given  by 

y(tn)  ■=  (x  ©  h)(tn)  +  7(71)  n  =  1 , . . . , N.  (1.1) 

For  simplicity,  we  assume  circular  convolution,  denoted  by  ©  in  (1.1).  Given  y.  we 
seek  to  estimate  x.  In  the  Discrete  Fourier  Transform  (DFT)  domain,  we  equivalently 
have 

Y(fn)  =  H(fn )  X(fn)  +  r(/n)  n  =  1, . . . ,  N,  (1.2) 

where  Y.  H,  X  and  T  are  the  length-lV  DFTs  of  the  observation  y,  LTI  system 
impulse  response  h ,  input  x  and  noise  7  respectively.  The  fn  :=  ^(n  —  1)  denote 
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Figure  1.1  :  Convolution  model  setup. 


the  normalized  frequencies  in  the  DFT  domain.  The  problem  formulation  trivially 
extends  to  multi-dimensional  data. 

If  the  system  frequency  response  H(fn)  has  no  zeros,  then  an  unbiased  estimate 
of  x  can  be  obtained  by  inverting  %  as 

x(f„)  :=  H-\fn)Y(fn) 

=  X(fn)  +  H-'(fn)r(fn).  (1.3) 

However,  if  the  system  is  ill-conditioned,  i.e.,  H(fn )  is  small  at  any  fn,  then  enormous 
noise  amplification  results  during  inversion,  resulting  in  an  extremely  noisy,  useless 
estimate. 

The  problem  of  noise  amplification  can  be  alleviated  by  using  an  approximate, 
regularized  inverse  instead  of  a  pure  inverse.  Regularization  aims  to  provide  a  bet¬ 
ter  solution  by  reducing  noise  in  exchange  for  some  distortion  in  the  estimate  [3]; 
regularization  becomes  essential  in  situations  involving  ill-conditioned  systems. 

Since  the  exact  input  is  generally  never  recoverable,  different  measures  can  be  used 
to  evaluate  the  quality  of  an  estimate;  one  commonly  used  measure  of  performance  is 
the  mean  squared  error  (MSE)  between  the  actual  input  x  and  the  estimate.  The  LTI 
Wiener  deconvolution  filter  is  a  commonly  used  example  of  regularized  deconvolution. 
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It  provides  the  MSE-optimal  regularized  LTI  solution  to  the  deconvolution  problem. 
Formally,  the  estimation  procedure  used  by  the  LTI  Wiener  deconvolution  filter  can 
be  understood  as  follows:  (1)  Invert  the  convolution  operator  using  y  to  obtain 
a  noisy  estimate  X(fn).  (2)  Regularize  by  shrinking  each  frequency  component  of 
X (fn)  proportionately  according  to  the  signal-to-noise-Ratio  (SNR)  at  fn  [4];  shrink 
less/more  when  SNR  is  high/low. 

The  Fourier  representation  diagonalizes  the  convolution  operator;  hence  the  LTI 
Wiener  deconvolution  filter  can  precisely  identify  and  attenuate  the  noise  that  gets 
amplified  during  inversion  of  7i,  thereby  fully  exploiting  the  structure  of  the  blurring 
system.  In  fact,  when  the  input  signal  can  be  modeled  as  wide-sense  stationary 
(WSS)  and  Gaussian,  the  LTI  Wiener  deconvolution  filter  is  MSE-optimal  over  all 
estimators. 

However,  the  LTI  Wiener  filter  does  not  provide  a  good  solution  when  the  input 
signal  contains  spatially  localized  phenomena  such  as  discontinuities,  eg.,  edges  (refer 
Figure  1.2(f)).  The  reason  behind  this  drawback  is  the  following:  Though  the  Fourier 
domain  exploits  the  structure  of  convolution  operator  %  well,  the  domain  is  not 
well-suited  to  represent  input  signals  with  spatially  localized  features  because  the 
Fourier  basis  functions  have  support  that  extend  over  the  entire  spatial  domain.  So, 
LTI  Wiener  filter  processing  lacks  spatial  selectivity,  consequently  important  spatial 
features  such  as  edges  get  smeared. 

Over  the  last  decade,  the  wavelet  transform  have  proven  to  be  an  invaluable  tool 
for  dealing  with  a  wide  class  of  signal  including  signals  with  spatially  localized  fea¬ 
tures.  Wavelet  representations  use  basis  functions  that  are  localized  in  time  and 
frequency.  Wavelets  provide  economical  representations  to  a  large  class  of  signals 
such  as  signals  belonging  to  Besov  spaces  [8];  wavelets  capture  most  of  the  signal 
energy  using  a  few  large  wavelet  coefficients.  The  ability  of  wavelets  to  economically 
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(g)  (h) 

Figure  1.2  :  (a)  Test  signal  x  (N  —  1024 ).  It  is  a  concatenation  of  Donoho’s  Blocks 

and  Heavisine  signals  [5].  (b)  Observed  signal  y  (MSE=0.0060).  It  is  obtained  by  blurring 
x  with  by  H  and  adding  noise,  (c)  System  frequency  response  H.  It  contains  a  zero  at 
index  -j-  (index  -y  corresponds  to  the  normalized  DFT  frequency  f  —  0.5 ).  (d)  System 
Inverse  frequency  response  H  1 .  It  amplifies  noise  at  high  frequencies.  The  mirror  wavelet 
frequency  splits  for  estimate  (g)  are  shown  with  dashed  vertical  lines,  (e)  Pure  inverse  esti¬ 
mate  (MSE=1.1056).  The  pure  inverse  jj  amplifies  all  the  high  frequency  noise  components 
to  give  an  extremely  noisy  estimate,  (f)  Wiener  filter  estimate  (MSE=0. 0048).  The  Wiener 
filter  is  ineffective  since  the  signal  contains  spatially  localized  phenomena  such  as  edges; 
hence  it  provides  an  unsatisfactory  estimate,  (g)  Mirror  wavelet  basis  estimate  [6,  pp.  458- 
461]  (MSE=0.0027).  The  frequency  splits  (shown  using  dotted  lines  in  (d))  aim  to  isolate 
the  frequency  where  H  goes  to  zero.  The  estimate  performs  better  than  the  WVD  method 
of  [7]  and  conventional  Wiener  filter,  (h)  WaRD  estimate  with  regularization  a*  —  0.06, 
(MSE=0.0017).  The  WaRD  system  tackles  the  noise  amplification  that  stems  from  the 
action  of  II  1  (refer  (d))  by  employing  partial  Fourier-domain  regularization  followed  by 
wavelet-domain  estimation.  It  easily  outperforms  the  mirror  wavelet  approach  as  well. 
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represent  a  wide  class  of  signals  has  been  leveraged  into  powerful,  spatially  adaptive, 
signal-estimation  algorithms  that  are  based  on  simply  shrinking  the  wavelet  coeffi¬ 
cients  of  the  noisy  signal.  Further,  such  simple  wavelet-based  estimation  schemes  are 
MSE  near-optimal  for  a  wide  class  of  signals;  they  outperform  all  linear  estimation 
techniques  [9]. 

The  problem  of  deconvolution  is  equivalent  to  signal  estimation  in  the  presence 
of  noise  colored  by  7/-1  (refer  (1.3)).  This  motivates  the  following  wavelet-based 
approach  to  the  problem  of  deconvolution:  (1)  Invert  the  convolution  operator  using 

y  to  obtain  a  noisy  estimate  X (fn).  (2)  Regularize  by  shrinking  each  wavelet 
coefficient  of  the  noisy  estimate  X (  fn)  using  the  noise  variance  at  each  wavelet  scale. 

Wavelet-based  deconvolution  techniques,  hereby  called  WaD,  aim  to  exploit  the 
economical  wavelet  representation  of  signals  to  effectively  identify  and  estimate  the 
signal.  In  fact,  for  special  classes  of  blurring  operators  such  as  the  Radon  transform, 
operator  inversion  followed  by  wavelet-domain  shrinkage  is  near-optimal  for  a  wide 
class  of  input  signals  [7, 10]. 

However,  a  WaD  scheme  cannot  provide  good  estimates  for  all  convolution  systems 
H.  In  fact,  the  wavelet-based  scheme  would  provide  a  zero  signal  for  an  estimate  in 
the  presence  of  most  systems  that  are  non-invertible;  eg.,  when  the  impulse  response 
h  is  a  box-car,  a  commonly  used  model  in  image  blurring  [2],  the  noisy  estimate 
x  obtained  after  system  inversion  has  infinite  noise  variance  at  all  wavelet  scales, 
rendering  wavelet-domain  estimation  ineffective.  In  such  a  case,  some  amount  of 
Fourier-domain  regularization  becomes  imperative. 

In  this  thesis,  we  propose  an  improved  hybrid  wavelet-based  regularized  decon¬ 
volution  (WaRD)  algorithm  suitable  for  use  with  any  ill-conditioned  systems.  The 
basic  idea  is  simple:  employ  both  Fourier-domain  (Wiener-like)  regularized  inversion 
and  wavelet-domain  signal  estimation.  During  this  tandem  processing,  we  exploit 
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Fourier-domain  regularization  to  adapt  to  the  convolution  system  and  thereby  con¬ 
trol  the  noise  but  use  it  sparingly  to  keep  the  accompanying  smearing  distortions  to 
the  minimum  required;  bulk  of  the  noise  removal  and  signal  estimation  is  achieved 
using  wavelet  shrinkage. 

Using  an  MSE  metric1,  we  show  that  the  optimal  balance  between  local  processing 
with  wavelet  basis  and  global  processing  with  Fourier  basis  is  heavily  influenced  by 
the  distribution  of  input  signal  energy  in  the  wavelet  domain,  and  the  convolution 
operator. 

Interestingly,  one  extreme  of  the  balance  is  to  perform  no  Fourier-domain  reg¬ 
ularization;  this  is  similar  to  the  approaches  of  [6,7].  In  [7],  Donoho  introduced 
the  Wavelet- Vaguellete  Decomposition  (WVD)  technique  to  solve  general  linear  in¬ 
verse  problems.  For  the  special  case  of  deconvolution,  WVD  is  equivalent  to  WaD 
as  described  above.  Donoho  proved  that,  for  a  certain  special  class  of  operators  that 
includes  the  Radon  transform,  WaD  possesses  asymptotically  optimal  rates  of  error 
decay  as  the  number  of  observation  samples  increases.  Since  WaRD  subsumes  WaD, 
the  proposed  WaRD  technique  also  possesses  asymptotically  optimal  error  decay  rates 
for  such  special  operators.  In  addition,  WaRD  also  improves  on  the  performance  of 
WaD  at  any  fixed  samples  by  choosing  the  optimal  amount  of  regularization.  Further, 
unlike  WaD,  WaRD  is  applicable  in  the  presence  of  any  convolution  operator. 

Recently,  Mallat  proposed  the  use  of  a  mirror-wavelet  basis  that  adapts  to  the 
frequency  response  of  convolution  operator  7~l  to  improve  on  the  performance  of  WaD 
when  the  frequency  response  of  H  is  as  shown  in  Figure  1.2(b)  [11].  Comparing 
Figures  1.2(g)  and  1.2(h),  we  see  that  WaRD  provided  a  better  estimate  than  that 

1Even  though  MSE  metric  does  not  capture  visual  appeal  of  images  in  general,  we  found  that  in 
case  of  wavelet-based  deconvolution,  optimizing  over  an  MSE  metric  gave  satisfactory  results. 
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obtained  by  an  adapted  basis  approach.  Though  an  adapted  basis  provides  better  re¬ 
sults  than  WaD  (not  shown),  it  is  not  effective  for  all  types  of  ill-conditioned  systems, 
eg.,  when  %  has  a  boxcar  system  response,  adapting  to  the  sine  frequency  response 
of  %  using  wavelets  fails. 

Nowak  has  employed  an  under-regularized  filter  in  [12]  during  system  inversion 
and  subsequently  used  wavelet-domain  signal  estimation  since  the  blurring  operator 
in  the  problem  was  non-invertible.  However  he  did  not  study  the  implications  of  using 
regularized  inversion  and  the  choice  of  the  optimal  amount  of  regularization. 

After  discussing  regularization  in  more  depth  in  Chapter  2,  we  briefly  review 
wavelet  transforms  and  its  properties  in  Chapter  3.  Previous  wavelet-based  decon¬ 
volution  approaches  are  discussed  in  Chapter  4.  We  present  our  improved  scheme  in 
Chapter  5  and  elaborate  on  its  implementation  in  Chapter  6.  Illustrative  examples 
lie  in  Chapter  7.  We  conclude  by  summarizing  our  work  and  sketching  out  future 
directions  in  Chapter  8. 
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Chapter  2 

Regularized  Inversion  of  Ill-Conditioned  Systems 

Deconvolution  involves  estimating  the  input  x  from  a  blurred  and  noisy  observation 
y  (refer  Eqn.  (1.2)).  Simply  inverting  the  system  fails  to  provide  a  satisfactory 
solution.  The  problems  that  arise  can  be  understood  by  analyzing  the  Singular  Value 
Decomposition  (SVD)  of  the  operator  %. 

2.1  System  Inversion  using  SVD 

Let  h®  z  denote  the  action  of  the  convolution  operator  %  on  some  signal  z.  This  can 
be  expressed  in  terms  of  the  eigenvectors  en  and  eigenvalues  Xn  of  the  operator  %  as 
follows. 

h®z  =  ^2\n(z,en)en,  (2.1) 

n 

where  (z,  en)  denotes  an  inner  product  between  z  and  en.  en  and  Xn  are  obtained  by 
the  SVD  of  the  operator  %.  The  action  of  the  inverse  of  %  on  any  signal  z,  denoted 
by  V1  ©  z,  is  given  by 

V1  ®  z  —  Arj1(z,  en)en.  (2.2) 

n 

Eqn.  (2.2)  reveals  that  the  projection  of  z  along  eigenvectors  that  correspond  to 
small  eigenvalues  of  H,  get  amplified  during  inversion.  For  any  convolution  operator, 
the  sinusoidal  DFT  vectors  form  the  eigenvectors  and  the  DFT  coefficients  of  the 
system  impulse  response  h  give  the  eigenvalues  of  the  system  %.  So,  during  inversion 
of  the  convolution  operator  (refer  Eqn.  (1.3)),  the  frequency  components  of  noise 
corresponding  to  small  Fourier  coefficients  of  h  are  amplified  during  inversion. 
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An  easy  fix  to  the  problem  of  noise  amplification  during  inversion  is  to  employ  a 
windowed-SVD  inversion  [3]. 

x  —  h ©  y 

'/’  ^  (fb  en)en 

n 

=  ^rnAn1(/i©  x  +  7,en)en 

n 

=  ^rn(x,en)en  +  rnAn1(7,  en)en,  (2.3) 

n  n 

where  rn  are  the  weights  chosen  to  shrink  the  components  of  noise  7  that  get  amplified 
during  inversion  of  %.  This  achieves  noise  reduction  albeit  at  the  cost  of  some  bias 
in  the  estimate.  The  resultant  operation  is  a  common  form  of  regularization  that 
achieves  quasi-inversion  of  %. 

2.2  LTI  Wiener  Deconvolution 

For  a  convolution  operator,  the  windowed-SVD  inversion  is  equivalent  to  separately 
shrinking  each  Fourier  component  of  the  noisy  observation  obtained  after  pure  inver¬ 
sion  to  attenuate  the  noise.  The  optimal  weights  that  can  be  used  to  shrink  each 
Fourier  component  during  regularized  inversion  is  a  function  of  the  SNR  at  each 
frequency.  For  WSS  signals,  the  optimal  weights  R(fn )  are  given  by 

,  \H(fn)\2P*(U) 

’’  \HUn)'fPAfn)  +  <y2' 

where  Px(fn )  is  the  power  spectral  density  of  the  stochastic  signal  x  and  a2  is  the 
variance  of  additive  noise  7.  The  resulting  LTI  quasi-inverse  operator  is  called  the 
LTI  Wiener  deconvolution  filter. 

The  action  of  the  LTI  Wiener  deconvolution  filter  can  be  understood  as  being 
composed  of  the  two  steps1(refer  Figure  2.1). 

1Both  these  steps  are  simultaneously  employed  in  practice.  So  the  LTI  Wiener  deconvolution 
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Figure  2.1  :  Block  diagram  of  Wiener  deconvolution  filtering:  Fourier  domain  shrinkage  is 
used  to  estimate  the  signal  in  the  presence  of  noise  colored  during  inversion  using  II  1 . 


1.  Pure  Inversion:  The  noisy  and  blurred  observation  y  is  treated  with  H to 
obtain  a  noisy,  unbiased  estimate  x  of  the  input  signal  as  explained  in  Eqn.  (1.3). 
This  step  necessarily  amplifies  the  noise  components  at  frequencies  where  H(fn) 
is  small.  Further  processing  is  required  to  estimate  the  signal  from  this  ampli¬ 
fied,  colored  noise. 

2.  Fourier-domain  Signal  Estimation:  Each  frequency  component  of  the  noisy 
signal  obtained  after  pure  inversion  is  shrunk  using  weights  R(fn).  From  ex¬ 
pression  (2.4),  it  is  clear  that  less  shrinkage  is  employed  at  components  where 
the  SNR  is  high.  The  LTI  Wiener  deconvolution  estimate  of  the  input  signal 
is  [4] 

X(fn)  ■■=  fi(/j(]Ty)n/J  (2.5) 

=  fi(/„W„)  +  fl(/n)(¥W)r(/n).  (2,6) 

For  Gaussian  WSS  signals,  the  LTI  Wiener  deconvolution  provides  the  globally 
MSE-optimal  estimate  of  the  input  since  the  Fourier  domain  ideally  represents  both , 
the  colored  noise  after  inversion,  and  the  signal  of  interest.  The  Fourier  coefficients 
of  the  signal  and  the  noise  colored  by  H are  independent  making  individual,  scalar 


filter  is  applicable  even  when  the  system  H  is  non-invertible. 


11 


estimation  in  the  Fourier  domain  optimal.  However,  when  the  input  signal  or  noise 
is  not  Gaussian,  Wiener  filtering  no  longer  remains  optimal. 

When  the  input  signal  x  is  deterministic,  the  optimal  weights  are  obtained  by  sub¬ 
stituting  Px(fn)  —  \X(fn)\2  in  Eqn.  (2.4)2.  In  the  deterministic  case,  the  conditions 
when  the  Wiener  estimate  (or  in  general  the  windowed-SVD)  provides  the  optimal 
solution  get  more  involved.  We  just  mention  that  smoothness  constraints  need  to  be 
imposed  on  the  signal  and  the  convolution  operator  so  the  windowed-SVD  is  optimal. 
We  refer  the  reader  to  [7, 13]  for  further  details.  We  have  assumed  a  deterministic 
signal  for  the  rest  of  the  thesis. 

2.3  Drawbacks  of  the  Wiener  Deconvolution  Filter 

The  LTI  Wiener  filter  does  not  provide  a  good  estimate  when  the  input  signal  com¬ 
prises  of  spatially  localized  phenomena  such  as  edges.  Though  the  Fourier  domain  is 
still  the  ideal  domain  to  represent  the  noise  colored  by  the  inversion  of  H,  the  domain 
is  not  well-suited  to  represent  the  signal  of  interest  x.  The  Fourier  basis  functions  that 
are  exploited  by  the  Wiener  filter  have  support  that  extend  over  the  entire  domain. 
So,  scalar  operations  on  the  Fourier  coefficients  lack  any  spatial  localization  thereby 
resulting  in  all  spatial  components  of  the  input  signal  being  processed  uniformly. 

For  example,  images  can  been  modeled  as  piece- wise  smooth  functions,  an  impor¬ 
tant  class  of  spatially  varying  signals  [14].  For  discontinuous  signals  such  as  piece- wise 
polynomial  signals,  the  energy  of  the  signal  in  the  Fourier  domain  decays  very  slowly, 
0(  £),  as  the  number  of  samples  N  of  the  underlying  continuous  process  increases.  As 
a  result,  for  a  low-pass  H,  the  error  per  sample  decays  very  slowly,  0(n21+2)  where 
v  characterizes  the  low-pass  filter,  as  increasingly  dense  samples  of  the  underlying 

2Note  that  the  quasi-inverse  is  now  signal-dependent  and  hence  non-linear  in  the  strict  sense. 
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continuous  time  observation  are  obtained  (refer  Appendix  B). 

The  slow  error  decay  results  because  the  variance  of  noise  colored  by  H is  large 
at  high  frequencies  leading  to  excessive  coefficient  shrinkage  (refer  expression  (2.4)) 
during  Wiener  estimation.  As  a  result,  the  high  frequency  components  required  to 
reproduce  sharp  edges  in  the  estimate  are  lost.  The  significant  signal  energy  loss  at 
high  frequencies  reflects  in  the  form  of  an  unsatisfactory  estimate. 

2.4  Alternative  Solutions 

Consider  the  following  modified  weights  that  can  be  used  for  shrinking  the  Fourier 
coefficients  of  the  noisy  signal  obtained  after  inversion  of  R. 

„  \H(fn)\2\X(fu)\2 

°Un>'  \H(fn)[2  \X (/„)|2  +  a 

where  a  is  the  called  the  regularization  parameter  [15];  the  Wiener  deconvolution  filter 
uses  ol  —  1.  For  values  of  a  <  1,  Ra(fn )  shrink  the  Fourier  coefficients  of  the  noisy 
signal  X (fn)  obtained  after  pure  inversion  (refer  Eqn.  (1.3))  less  than  the  weights 
used  by  the  Wiener  filter.  As  a  result,  they  cause  less  distortion  and  consequently 
result  in  sharper  edges  in  the  estimate.  However,  small  a  values  back-fire  in  the  form 
of  higher  noise  levels  in  the  estimate.  On  the  other  hand,  for  values  of  a  >  1,  Ra{fn ) 
efficiently  suppresses  noise  but  at  the  cost  of  excessive  distortion  and  smearing  of 
edges.  Thus,  due  to  the  inherent  bias-variance  tradeoff,  simply  changing  the  amount 
of  Fourier-domain  shrinkage  does  not  provide  a  satisfactory  solution  to  the  problem 
of  preserving  spatially  varying  phenomena. 

Better  deconvolution  techniques  need  to  take  the  spatial  vagaries  of  the  signal  into 
account.  One  such  technique  is  to  employ  the  best  linear  estimator,  the  time-varying 
or  matrix  version  of  the  Wiener  inverse  [16].  However,  the  time- varying  Wiener  filter  is 
impractical  because  it  requires  the  input  signal  cross-correlation  matrix  which  in  turn 
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needs  precise  knowledge  of  the  spatially  varying  phenomena  of  the  signal.  Further, 
the  processing  is  computationally  intensive,  0(N3)  where  N  is  the  number  of  samples, 
because  the  estimator  possesses  no  special  structure. 

Deconvolution  is  equivalent  to  the  problem  of  estimating  the  signal  in  convolu- 
tionally  colored  noise.  In  Utopia,  we  would  come  up  with  a  transform  that  captures 
the  signal  energy  and  colored  noise  energy  in  separate,  distinguishable  coefficients. 
However,  this  is  impossible  for  most  interesting  problems.  Instead,  we  must  com¬ 
promise  and  find  a  domain  where  most  of  the  signal  energy  is  concentrated  in  a  few 
identifiable,  relatively  noise-free  coefficients  that  we  would  retain.  It  is  also  important 
that  the  domain  be  able  to  adapt  to  the  unknown  spatial  characteristics  of  the  signal. 
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Chapter  3 

Background  on  Wavelets  and  Signal  Estimation 

3.1  Wavelet  Transform 

The  discrete  wavelet  transform  (DWT)  represents  a  1-d  signal  x  in  terms  of  shifted 
versions  of  a  low-pass  scaling  function  <j>  and  shifted  and  dilated  versions  of  a  prototype 
bandpass  wavelet  function  ip  [6, 17].  For  special  choices  of  p  and  ip,  the  functions 


^hkif) 

:=  2j/2ip(2jt  -  k)  , 

(3.1) 

:=  2 9  <p(2H  —  k)  ,  j,k  E  Z, 

(3,2) 

form  an  orthonormal  basis,  and  we  have  the  representation  [6, 17] 

j 

x{t)  =  ^2uj0ik<pj0ik(t)  +  (3-3) 

k  j=jo  k 

with  Uj,k  :=  fx(t )  <p*jk(t)  dt  and  Wj '■=  fx(t)  ipj^it)  dt.  The  parameter  J  controls  the 
resolution  of  the  wavelet  representation. 

For  a  discrete-time  signal,  the  N  wavelet  coefficients  {uj0yk,Wj,k}  of  x(tn)  can  be 
easily  computed  using  a  filter  bank  consisting  of  low-pass  filters,  high-pass  filters, 
and  decimators  [17].  Due  to  the  special  filter  band  structure  the  forward  and  inverse 
wavelet  transform  can  be  computed  in  0(N )  operations,  where  N  is  the  length  of 
the  signal.  We  have  used  the  periodic  DWT,  which  uses  circular  convolutions  in  the 
filter  bank,  in  this  thesis.  For  brevity,  we  will  collectively  refer  to  the  set  of  scaling 
and  wavelet  coefficients  as  {%*;}  :=  {uj0,k->wj,k}-  Multidimensional  DWTs  are  easily 
obtained  by  alternately  wavelet-transforming  along  each  dimension  [6, 17]. 
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3.1.1  Multiresolution  and  time- frequency  localization  of  wavelets 

Wavelet  provide  a  multiresolution  representation  of  a  signal,  i.e.,  the  wavelet  co¬ 
efficients  capture  the  signal  features  at  different  resolution  levels.  In  the  wavelet 
representation  notation  used,  j  indexes  the  scale  or  the  resolution  of  analysis  -  larger 
j  corresponds  to  higher  resolution  of  analysis,  j  —  j0  indicates  the  coarse  scale  or 
lowest  resolution  of  analysis  and  j  —  J  indicates  the  fine  scale  or  highest  resolution 
of  analysis.  In  the  wavelet  representation,  k  indexes  the  spatial  location  of  analysis. 
For  a  wavelet  ip( t )  centered  at  time  zero  and  frequency  /0,  the  wavelet  coefficient  Wj ^ 
measures  the  signal  content  around  time  k  and  around  frequency  2jl  /0.  Thus, 
wavelets  exhibit  simultaneous  spatial  and  frequency  localization.  The  relation  of  the 
DWT  coefficients  to  tilings  in  a  time-frequency  plane  is  graphically  depicted  in  Fig¬ 
ure  3.1(a).  In  contrast,  Figure  3.1(b)  depicts  the  time- frequency  tiling  of  the  Fourier 
transform. 

3.1.2  Wavelets  as  Unconditional  Bases 

Wavelets  enjoy  the  following  important  property:  they  provide  an  unconditional  basis 
for  a  signal  classes  such  as  Besov  classes,  which  determines  membership  to  any  space 
in  the  class  based  on  the  signal  smoothness;  signal  with  different  smoothness  belong 
to  different  parameterized  spaces  in  the  Besov  class  [8].  For  example,  important 
spaces  such  as  Sobolev  spaces  and  Lp  spaces  belong  to  the  Besov  class.  In  essence, 
the  unconditional  basis  property  of  wavelets  means  that  signals  belonging  to  such 
classes  are  characterized  only  by  the  amplitudes  of  the  wavelet  coefficients  [7,8].  In 
contrast,  the  Fourier  basis  do  not  provide  an  unconditional  basis  for  such  a  wide  class 
of  signals. 

The  implications  of  this  abstract  notion  of  unconditional  basis  are  extremely  ap- 
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Figure  3.1  :  (a)  Time- frequency  tiling  for  DWT:  The  DWT  basis  function  corresponding 
to  the  fine  scale  wavelet  coefficients  103,1,103,2,  •  •  •  w3,s  are  well-localized  in  time  but  have 
large  bandwidths.  They  capture  the  high-frequency  characteristics  of  a  signal  at  different 
times.  The  DWT  basis  functions  corresponding  to  the  coarser  scales  wavelet  coefficients 
W‘2.\  , . . .  u'2.4  and  u'i.iauduji.2;  capture  the  intermediate  frequency  characteristics  of  a  sig¬ 
nal.  The  scaling  coefficients  uiyandui^  capture  the  lowest  frequency  characteristics  of  the 
signal  at  different  times,  (b)  Time-frequency  tiling  for  DFT:  In  contrast  to  the  DWT  basis 
functions,  the  DFT  basis  functions  have  support  over  the  entire  time  domain  but  are  per¬ 
fectly  localized  in  frequency.  The  DFT  coefficients  capture  the  overall  signal  energy  at  a 
particular  frequency  fn. 


pealing.  In  [18],  Donoho  shows  that  unconditional  basis  are  desirable  because  they 
typically  express  the  signal  very  economically  by  capturing  most  of  the  signal  en¬ 
ergy  in  a  few  coefficients.  The  wavelet  coefficients  of  signals  in  a  Besov  class  decay 
exponentially  with  scale  in  the  wavelet  domain! 

In  addition  to  having  direct  implications  on  data  compression,  economical  rep¬ 
resentations  are  also  desirable  for  signal  estimation  using  non-linear  approximation 
[19,20].  In  fact,  unconditional  basis  are  optimal  for  signal  estimation  and  data  com¬ 
pression  [18].  Further,  the  unconditional  basis  property  also  ensures  that  even  simple 
scalar  operations  on  each  wavelet  coefficient  is  sufficient  to  ensure  near-optimal  per¬ 


formances. 
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3.2  Signal  Estimation  by  Wavelet  Shrinkage 

Recently,  wavelets  have  provided  effective  solutions  to  the  problem  of  signal  estimation 
in  the  presence  of  noise  [5,9].  Consider  the  problem  of  signal  estimation  in  the 
presence  of  white  noise.  Many  real  world  signals  have  economical  wavelet-domain 
representations  where  a  few  large  wavelet  coefficients  capture  most  of  the  signal  energy 
[6,14],  However,  since  a  wavelet  transform  is  orthonormal  and  linear,  the  energy 
of  white  noise  remains  scattered  over  all  the  wavelet  coefficients.  This  disparity 
between  the  signal  and  noise  representation  in  the  wavelet  domain  has  been  exploited 
in  a  number  of  powerful,  near-optimal,  signal  estimation  techniques  based  on  simply 
shrinking  the  wavelet  coefficients  of  the  noisy  signal. 

Since  wavelets  provide  good  economical  representations  in  the  presence  of  spatial 
varying  features  of  the  signal,  signal  estimation  in  the  wavelet  domain  preserves  such 
features  more  effectively  than  conventional  LTI  techniques  such  as  estimation  methods 
based  on  windowed-SVD.  In  fact,  wavelet-domain  signal  estimation  techniques  are 
near-optimal  for  a  wide  class  of  signals  including  piece- wise  smooth  signals  and  signals 
in  the  Besov  class[5, 18]. 

Such  wavelet-domain  signal  estimation  techniques  based  on  shrinkage  can  also  be 
extended  to  estimate  signals  in  the  presence  of  colored  noise.  The  optimality  of  such 
a  wavelet-based  approach  now  becomes  dependent  on  the  coloring  of  the  noise  in 
addition  the  signal  class  [10] 
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Chapter  4 


Wavelet-based  Deconvolution  (WaD) 


4.1  The  WaD  Algorithm 

Since  Eqn.  (1.3)  and  Eqn.  (1.1)  are  equivalent,  deconvolution  is  equivalent  to  esti¬ 
mating  the  input  signal  in  the  presence  of  noise  colored  by  if-1.  The  near-optimality 
of  wavelet-based  techniques  for  a  wide  variety  cases  encountered  in  signal  estimation 
combined  with  the  appealing  ability  of  wavelets  to  adapt  to  unknown  spatial  features 
of  a  signal  has  inspired  the  use  of  wavelets,  particularly  when  the  signal  contains  spa¬ 
tially  localized  features,  to  solve  the  problem  of  deconvolution  in  the  following  way 
(refer  Figure  4.1). 

1.  Pure  Inversion:  Similar  to  Wiener  deconvolution,  the  first  step  involves  obtain 
a  noisy,  unbiased  estimate  x  of  the  input  signal  as  explained  in  Eqn.  (1.3). 

2.  Wavelet-based  Signal  Estimation:  In  contrast  to  Fourier-domain  shrinkage 
employed  by  the  Wiener  deconvolution  filter,  WaD  estimates  the  signal  from 


Figure  4.1  :  Block  diagram  of  Wavelet-based  deconvolution  (WaD):  Wavelet-domain  shrink¬ 
age  is  used  to  estimate  the  signal  in  the  presence  of  noise  colored  during  inversion  using 
H  \ 
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the  noisy  x  obtained  after  pure  inversion  by  shrinking  each  wavelet  coefficient 
of  the  noisy  signal1.  The  variance  of  noise  corrupting  all  wavelet  coefficients 
in  a  particular  scale  is  the  same,  but  the  variance  varies  with  different  scales 
[10].  So  scale-dependent  shrinkage  is  employed  to  estimate  the  signal  wavelet 
coefficients.  Inverse  DWT  is  then  used  to  obtain  the  wavelet-based  estimate 
from  the  estimated  signal  wavelet  coefficients. 

4.2  Optimality  of  WaD 

This  basic  idea  was  first  exploited  by  Donoho  who  studied  the  applications  of  wavelets 
to  linear  inverse  problems  and  showed  that  a  wavelet-based  deconvolution  approach 
is  near-optimal  to  recover  a  wide  class  of  signals  (eg.,  Besov  classes)  when  the  linear 
operator  %  satisfies 

h(t)  ©  y(at )  =  a P  h(at)  ©  y(t),  (4.1) 

for  some  exponent  0.  Such  an  operator  %  is  said  to  be  scale  invariant  or  dilation 
homogeneous.  The  Radon  transform  is  an  important  example  of  dilation  homoge¬ 
neous  operators  [7,  21].  To  the  best  of  our  knowledge,  the  optimality  of  wavelet-based 
deconvolution  for  general  dilation-inhomogeneous  operators  is  still  unknown.  Using 
wavelet-based  techniques,  Nowak  [12]  has  observed  impressive  results  for  some  com¬ 
mon  LTI  operators  as  well.  Mallat  [11]  has  also  advocates  a  similar  philosophy  and 
claimed  state-of-the-art  performances  in  satellite  image  recovery. 

We  again  consider  the  example  of  piece-wise  polynomial  functions  to  gain  in¬ 
sight  into  the  advantages  of  such  an  approach.  Wavelets  represent  an  length- iV 
piece- wise  polynomial  signal  very  economically  using  just  0(log2  N)  non-zero  wavelet 

1Note  that  different  shrinkage  techniques  can  be  used  on  the  noisy  wavelet  coefficients  but  the 
philosophy  remains  the  same:  Use  wavelet-domain  estimation  instead  of  Fourier-domain  estimation 
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coefficients2.  As  a  result,  assuming  a  invertible  low-pass  the  error  per  sample 
decays  very  rapidly,  O  (  1°e21A'  ) ,  as  increasingly  dense  samples  of  the  underlying  con- 

\  TV  ^»'+i  / 

tinuous  time  observations  are  obtained  (refer  Appendix  C).  The  error  decay  rate  of 
O  (  logVV  I  is  significantly  faster  than  O  (  11  )  that  is  achieved  by  Fourier-domain 

shrinkage. 

WaD  achieves  faster  error  decay  rates  because  the  energy  of  the  input  signal  gets 
concentrated  in  a  few  large  coefficients  in  the  wavelet  domain.  If  the  noise  variance 
in  these  large  signal  coefficients  is  not  excessive  then  identifying  and  retaining  these 
signal  coefficients,  and  setting  the  other  coefficients  to  zero  using  shrinkage  gives  an 
excellent  estimate.  Because  the  retained  signal  coefficients  capture  most  of  the  spatial 
features  such  as  edges  in  the  signal,  the  final  estimate  retains  its  sharp  edges  as  well 
as  noise-free  smooth  regions. 


4.3  Drawbacks  of  WaD 


However,  such  a  wavelet-based  approach  has  its  limitations.  These  can  be  understood 
by  focusing  on  the  variance  of  the  wavelet  coefficients  of  noise  colored  by  H  1 .  The 


noise  variance  o2  at  scale  j  can  be  approximately  by3 


n=2?  —1 


1 

WT  E 


n= 2i 


U  W» 


(4.2) 


where  H(fn)  are  the  discrete  Fourier  coefficients  of  the  blurring  operator  H,  and  a2 
is  the  variance  of  the  AWGN  7  (refer  Figure  4.2). 


2  Consider  a  wavelet  system  with  basis  functions  that  possess  number  of  zero-moments  that  is 
greater  than  or  equal  to  the  degree  of  all  the  polynomial  pieces.  If  the  support  of  any  wavelet  basis 
function  lies  within  any  polynomial  piece,  then  the  corresponding  wavelet  coefficient  is  zero. 

3  This  approximation  is  valid  only  when  the  zeros  in  the  frequency  response  H  are  of  sufficiently 


low  order. 
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Figure  4.2  :  (a)  The  solid  line  is  the  frequency  response  of  the  inverse  of  a  typical  lowpass 
system.  The  dot-and-dashed  lines  show  the  frequency  bands  corresponding  to  wavelet  basis 
functions  at  different  scales.  The  wavelet  basis  functions  are  nearly  band-limited  and  almost 
constant  within  dyadic  frequency  bands  (e.g.  -j-  to  p ).  These  dyadic  frequency  bands  are 
demarcated  in  the  figure  using  dotted  vertical  lines.  If  blurring  system  frequency  response 
does  not  have  a  high-order  zero  at  frequency  ^ .  the  variance  of  noise  colored  by  H  1  (solid 
line)  at  different  scales  is  primarily  determined  by  the  frequency  response  of  the  inverse 
within  the  corresponding  dyadic  frequency  band,  (b)  The  dot-and-dashed  lines  show  the 
frequency  responses  of  Mallat’s  mirror  wavelet  basis.  The  mirror  wavelet  basis  functions 
adopt  a  frequency  split  that  aims  to  isolate  the  singularities  in  the  inverse  system  frequency 
response  and  thereby  reduce  the  noise  variance  in  most  of  the  mirror  wavelet  basis  subbands. 
However,  if  the  singularity  is  not  located  at  a  dyadic  frequency  point  -p  p. . . .)  then 
the  non-zero  frequency  overlap  leads  to  infinite  noise  variance  at  all  scales  in  any  adapted 
wavelet  basis  system.  The  same  problem  also  crops  up  when  the  blurring  system  response 
has  a  high  order  zero.  So  an  adapted  wavelet  basis  system  is  not  effective  for  all  convolution 
operators. 
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From  the  expression  (4.2),  it  is  clear  that  even  if  H(fn)  is  small  at  any  isolated 
frequency  fn,  the  variance  of  the  colored  noise  in  the  entire  corresponding  wavelet 
scale  explodes.  Consequently,  the  task  of  extracting  the  signal  coefficient  from  this 
scale  using  any  form  of  scalar  shrinking  in  the  wavelet  domain  is  rendered  ineffective. 
As  shown  in  Figure  4.2,  wavelet  basis  functions  are  not  exactly  band-limited;  the  DFT 
coefficients  of  the  operator  outside  the  band  2 !J"_1  to  2-7  —  1  also  contribute  to  the  noise 
variance  but  to  a  lesser  extent.  However,  if  the  the  operator  contains  singularities 
that  are  placed  at  non-dyadic  frequencies,  then  the  noise  variance  is  infinite  at  all  the 
wavelet  scales.  In  such  a  case,  WaD  provides  a  zero  estimate. 

4.4  Best-Basis  Solution  Improves  Performance 

Mallat  advocates  adapting  the  wavelet  basis  to  the  frequency  response  of  the  inverse 
of  H.  to  improve  on  the  performance  of  WaD  [6, 11].  For  a  convolution  system  with 
frequency  response  shown  in  Figure  1.2(c),  the  adaptation  is  achieved  by  using  a 
mirror- wavelet  basis  that  possess  a  time-frequency  tiling  structure  different  from  that 
possessed  by  conventional  wavelets  (refer  Figure  4.2).  The  tiling  structure  used  by  the 
mirror  wavelet  basis  functions  aims  to  isolate  the  frequency  where  the  convolution 
operator  H[f)  approaches  zero  because  the  variance  of  the  noise  at  any  scale  is 
primarily  influenced  by  the  singularities  of  H~1(f)  that  lie  in  the  frequency  band 
corresponding  to  that  wavelet  scale.  An  estimate  obtained  by  using  such  an  approach 
is  shown  in  Figure  1.2(f). 

However,  it  is  not  always  possible  to  come  up  with  a  set  of  basis  functions  that 
can  isolate  all  such  singular  frequencies  in  H  1  because  the  wavelet  basis  functions 
are  not  exactly  band-limited.  For  example,  any  adapted  wavelet  basis  scheme  cannot 
provide  a  satisfactory  solution  when  the  blurring  operator  has  a  boxcar  system  impulse 
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response;  the  frequency  response  of  the  7~L  is  the  sine  function.  In  such  a  case,  the 
variance  of  noise  colored  by  H  1  is  high  in  all  the  wavelet  scales,  rendering  signal 
estimation  in  adapted  basis  domain  ineffective. 

Recollect  that  during  inversion  using  if-1,  only  the  noise  components  correspond¬ 
ing  to  the  small  Fourier  coefficients  of  the  convolution  operator  get  amplified.  So  it  is 
natural  to  exploit  the  structure  of  the  operator  using  the  Fourier  domain  to  restrict 
excessive  noise  amplification  in  addition  to  using  the  frugal  wavelet  representation  of 
the  signal.  This  motivates  our  hybrid  approach  to  deconvolution. 
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Chapter  5 

Wavelet-based  Regularized  Deconvolution  (WaRD) 

In  WaD,  estimating  the  signal  coefficients  in  the  wavelet  domain  is  reliable  only  if 
the  colored  noise  does  not  corrupt  the  signal  wavelet  coefficients  heavily.  The  noise 
variance  in  the  wavelet  domain  can  be  significantly  reduced  by  attenuating  the  am¬ 
plified  Fourier  components  of  noise.  This  attenuation  can  be  easily  achieved  by  using 
a  small  amount  of  Fourier-domain  regularization  during  inversion.  So,  we  propose 
Wavelet-based  Regularized  Deconvolution  (WaRD)  that  simultaneously  exploit  the 
spatial  adaptivity  of  wavelets  and  diagonalized  Fourier  domain  representation  of  the 
convolution  operator  to  solve  the  deconvolution  problem. 

5.1  The  WaRD  Algorithm 

The  WaRD  algorithm  consists  of  the  following  steps. 

1.  Regularized  Fourier-domain  Inversion:  Instead  of  using  a  pure  inverse,  we 
use  a  LTI  Fourier-domain  regularized  filter.  As  shown  in  Figure  5.1,  this  filter 
partially  attenuates  noise  by  mildly  shrinking  the  noisy  coefficients  obtained 
after  inversion  of  R.  in  the  Fourier-domain  using  weights  Ra{fn )  (refer  Eqn. 
(2.7)).  The  signal  obtained  after  quasi-inversion  is 

Xa(fn)  ~  ))  R<x(fn)^'  ifn)  (5-1) 

=  Mfn)X(fn)+(-t—\  RMn)  !-(/„)  (5.2) 
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Fourier-domain  regularized  inversion  Wavelet-domain  estimation 


Figure  5.1  :  Block  diagram  of  Wavelet-based  regularized  deconvolution  (WaRD):  The  first 
block,  Fourier  domain  regularized  inversion,  performs  inversion  followed  by  partial  Fourier- 
domain  regularization  to  achieve  partial  noise  attenuation.  Wavelet-domain  estimation  that 
is  based  on  shrinkage  of  wavelet  coefficients  ofx  subsequently  estimates  the  signal  in  the 
presence  of  noise  colored  during  regularized  inversion. 

where  the  parameter  a  that  controls  the  amount  of  Fourier-domain  shrinkage 
employed  during  quasi-inversion,  is  typically  small  (a  <  1).  The  choice  of  a  is 
discussed  in  the  later  sections. 

2.  Wavelet-domain  Signal  Estimation:  Since  the  amount  of  regularization 
used  in  the  first  stage  is  small,  the  estimate  Xa  obtained  after  regularized  in¬ 
version  still  contains  some  residual  noise.  We  obtain  the  signal  estimate  by 
shrinking  wavelet  coefficients  of  Xa  at  each  scale  according  to  the  noise  vari¬ 
ance  at  that  scale.  We  then  employ  the  inverse  DWT  on  the  estimated  signal 
wavelet  coefficients  to  obtain  the  WaRD  estimate  of  the  input  signal  (refer  Fig¬ 
ure  5.1).  We  emphasize  that  wavelet-domain  signal  estimation  remains  effective 
since  most  of  the  signal  energy  is  still  captured  by  just  a  few  large  wavelet  co¬ 
efficients  since  the  amount  of  distortion  introduced  by  Ra(fn)  is  minimal  and 
the  noise  in  these  wavelet  coefficients  is  not  excessive,  thanks  to  Fourier-domain 
regularization  used  during  inversion. 
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5.2  Tradeoff:  Distortion  vs.  Noise  Attenuation 

Noise  reduction  using  Fourier-domain  regularization  comes  at  the  cost  of  signal  dis¬ 
tortion  and  hence  needs  to  be  controlled.  In  other  words,  this  raises  the  question: 
how  to  pick  the  right  value  for  the  regularization  parameter  oP.  The  tradeoff  is  clear: 
On  one  hand,  since  regularization  smears  non-stationary  signal  features  like  edges  and 
ridges,  we  would  prefer  a  as  small  as  possible.  On  the  other  hand,  large  a  prevents 
excessive  noise  amplification  during  inversion  which  aids  the  wavelet-domain  signal 
estimation.  Thus  a  controls  the  bias-variance  tradeoff  in  the  WaRD  system. 

We  seek  to  determine  the  optimal  regularization  parameter  for  the  WaRD  system 
by  minimizing  the  overall  MSE.  This  optimal  amount  of  regularization  would  balance 
the  amount  of  noise  reduction  carried  out  in  the  Fourier  domain  and  the  wavelet 
domain.  The  overall  MSE  is  well-approximated  by  a  proposed  cost  function  that 
includes  the  distortion  error  due  to  the  Fourier-domain  regularized  inversion  stage 
and  error  incurred  during  wavelet-domain  signal  estimation  stage. 

5.3  The  Cost  Function 

We  derive  the  cost  function  that  is  approximately  equal  to  the  MSE  of  the  WaRD 
estimate.  The  cost  function  assumes  that  ideal  thresholding  T(.)  is  employed  during 
signal  estimation  in  the  wavelet  domain.  Ideal  thresholding  keeps  a  noisy  wavelet 
coefficient  only  if  the  signal  power  in  that  coefficient  is  greater  than  the  noise  power. 
Otherwise,  the  coefficient  is  set  to  zero. 

j  Oj,k>  if  \ej,f  >  aj(a) 

1  0,  if  lOfJ2  <  a]  (a) 

where  9°k  are  the  wavelet  coefficients  of  the  noisy  signal  Xa(fn )  obtained  after  par¬ 
tially  regularized  inversion  in  (5.2),  0"fc  are  the  wavelet  coefficients  of  the  the  distorted 


(5.3) 


27 


but  noiseless  input  signal  Ra(fn )  X  (/„),  and  oj(a)  is  the  variance  of  noise  (/„) 

at  wavelet  scale  j.  Ideal  thresholding  assumes  that  the  signals  under  consideration 
are  known. 

The  deconvolution  MSE  for  the  WaRD  system  approximately  consists  of  the  signal 
distortion  due  to  Fourier-domain  regularization  and  the  error  due  ideal  thresholding 
that  is  used  to  achieve  wavelet-domain  noise  shrinkage.  So  we  propose  a  cost  function 
MSE(a;)  given  by 

n=N 

MSE(a)  :=  £ [1  -  Ra(fn)f  |A'(/„)|2  +  £  mm(|0“J2,  a](a)) .  (5.4) 

n=  1  j,k 

Here  0"fe  denotes  the  wavelet  coefficients  of  the  input  signal  distorted  by  Fourier 
domain  regularization,  Ra(fn)  X(fn),  and  <7? (a)  is  the  variance  of  colored  noise 
r (/„)  at  scale  j.  The  hrst  term  in  MSE(o;)  is  an  estimate  of  the  distortion 
in  the  input  signal  due  to  regularized  inversion  [15].  This  distortion  error  is  an  in¬ 
creasing  function  of  a.  The  second  term  is  an  estimate  of  the  error  due  to  ideal 
wavelet-domain  thresholding  [22],  This  thresholding  error  is  a  decreasing  function  of 
a  since  both,  the  signal  energy  and  the  noise  variance,  reduce  with  increase  in  a. 
The  optimal  regularization  parameter,  denoted  by  a*,  corresponds  to  the  minimum 
of  MSE(a). 

5.4  Accuracy  of  the  cost  function 

We  now  show  that  the  proposed  cost  function  MSE(o;)  approximates  the  actual  MSE 
of  a  WaRD  system  well.  The  cost  function  assumes  that  the  total  error  is  composed 
of  independent  contributions  from  the  distortion  error  incurred  during  the  Fourier- 
domain  regularization  stage  and  the  error  due  to  subsequent  ideal  wavelet-domain 
thresholding.  We  analyze  two  following  two  cases  and  their  effects  on  the  cost  function 
MSE(cr). 
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j Ojk\2  >  <Jj(a):  When  a  wavelet  coefficient  of  Ra{fn)X(fn)  has  greater  energy  than 
the  noise  variance  aj  (a) ,  ideal  thresholding  retains  the  corresponding  noisy 
wavelet  coefficient  intact  (refer  (5.3)).  In  this  case,  the  error  contribution  due 
to  ideal  thresholding  error  term  in  MSE(a;)  is  aj  (a) ,  which  is  exactly  equal  to 
the  actual  MSE. 

\@j,k\'2  —  aj(a):  However,  when  \9^k\2  <  a|(a;),  ideal  thresholding  sets  the  correspond¬ 
ing  wavelet  coefficient  to  zero.  The  actual  error  is  the  energy  of  the  wavelet  coef¬ 
ficient  of  the  original  signal  x,  \6j,k\2.  However,  the  contribution  of  the  proposed 
cost  function  MSE(a)  is  — 0“fe|2+  | 2 .  This  contribution  is  approximately 

equal  to  9j_k | 2  when  the  distortion  | Qjjk  —  9jk j2  is  small  in  comparison  to  \9fjk\2- 
So,  for  small  values  of  a,  which  typically  is  the  case,  the  distortion  is  com¬ 
paratively  small  and  hence  the  cost  function  is  fairly  accurate.  The  proposed 
cost  function  is  always  within  a  factor  of  the  error  2  within  the  actual  error  for 
any  positive  value  of  a.  However,  we  believe  that  these  extreme  cases  are  not 
generally  encountered  in  practice. 

So  the  proposed  cost  function  is  a  good  approximation  of  the  actual  error  incurred 
(assuming  ideal  thresholding);  hence  MSE(ct)  can  be  minimized  to  find  the  regular¬ 
ization  parameter  that  balances  Fourier-domain  regularized  inversion  and  effective 
wavelet-domain  signal  estimation. 

5.5  Optimal  a  for  each  Scale 

An  interesting  generalization  to  having  a  single  Fourier-domain  regularized  inverse 
filter  for  all  subbands  is  to  have  Fourier-domain  regularized  inverse  filters  with  a 
different  regularization  parameter  for  every  wavelet  scale  (refer  Figure  5.2).  In  this 
case,  we  need  to  determine  the  optimal  regularization  parameter  for  each  wavelet 
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Figure  5.2  :  Different  regularization  parameters  for  each  wavelet  scale:  In  the  figure,  the 
filters  ho,  hi  and  h-2  are  the  equivalent,  cumulative  filters  at  each  scale  in  a  2-level  decom¬ 
position  DWT  filter-bank.  Such  an  equivalent  filter-bank  representation  can  be  obtained 
by  transforming  the  filters  in  the  filter  bank  using  the  noble  identities  [23,  pp.  119]  and 
then  grouping  the  synthesis  filters  and  the  decimators  together.  Different  regularization 
parameters,  oq,  ot\,  and  a-2  are  used  at  each  scales  as  shown. 


subband.  Such  a  generalization  helps  making  the  cost  function  separable  with  respect 
to  the  regularization  parameter  of  each  scale  thereby  simplifying  the  analysis.  It  turns 
out  (refer  appendix  A)  that  the  optimal  regularization  parameter  a*  for  the  scale  p 
satisfies 

aP  ~  (iCfcl  >  a?K))  >  (5-5) 

where  denotes  the  wavelet  coefficients  of  the  distorted  input  signal  Ra{fn )  X(fn) 

at  scale  p,  and  a p(ap)  is  the  variance  of  noise  (  jT( /  "f )  r(/n)  at  scale  p. 
#(lC*l  >  cr'p(ap)']  denotes  the  number  of  wavelet  coefficients  6ppk  that  have  energy 
greater  than  the  noise  variance  ap(ap). 

Expression  (5.5)  reveals  that  the  selection  of  the  optimal  regularization  parameter 
is  heavily  influenced  by  the  distribution  of  the  wavelet  coefficients  relative  to  the 
variance  of  the  colored  noise.  Expression  (5.5)  is  a  recursive  expression  since  both 
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sides  of  the  equation  depend  on  a*. 

Consider  a  signal  such  that  the  signal  energy  in  some  wavelet  scale  p  is  concen¬ 
trated  in  very  small  proportion  of  the  wavelet  coefficients  at  that  scale,  i.e.  few 
wavelet  coefficients  are  large  while  most  of  the  other  wavelet  coefficients  are  nearly 
zero.  This  economical  distribution  does  not  change  significantly  when  the  signal  is 
distorted  by  using  a  small  amounts  of  regularization  because  the  derivative  of  the 
total  distortion  introduced  at  zero  regularization  is  0  (put  ap  =  0  in  Eqn.  (A. 5)).  On 
the  other  hand,  from  Eqn.  (A.  16),  one  can  infer  that  the  rate  of  decrease  of  the  noise 
variance  in  the  wavelet  scale  is  maximum  at  ap  =  0.  So,  the  variance  of  the  colored 
noise  in  this  scale  decreases  rapidly  at  small  values  of  ap.  The  optimal  regularization 
parameter  condition  suggests  that  the  noise  variance  be  reduced  using  regularization 
so  that  a  sufficient  number  of  wavelet  coefficients  of  the  signal  stand  out  above  the 
noise  variance. 

We  emphasize  that  the  optimal  regularization  parameter  a *  is  never  zero.  If 
olp  =  0  satisfies  the  optimality  condition,  it  would  imply  that  the  noise  variance 
is  higher  than  the  energy  of  all  the  wavelet  coefficients.  Hence  using  ap  =  0  would 
result  in  all  wavelet  coefficients  at  scale  p  being  shrunk  to  zero  during  wavelet-domain 
estimation  stage!  For  most  signals,  a  significant  proportion  of  the  wavelet  coefficients 
are  very  small;  so  ap  =  1  will  not  satisfy  (5.5).  Hence  ap  =  1  also  is  not  an  optimal 
choice  for  the  regularization  parameter. 

Several  values  of  ap  could  potentially  satisfy  the  optimality  criterion.  The  choice 
of  the  optimal  regularization  parameter  in  such  a  case  is  not  clear.  However,  this  is 
not  a  significant  issue  in  practice  since  choosing  any  ap  sufficiently  greater  than  zero 
(ap  ~  0.2)  was  found  to  provide  estimates  comparable  to  that  obtained  by  choosing 
the  optimal  a. 

Thus,  the  expression  for  the  optimal  regularization  parameter  quantifies  the  no- 
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tion  of  balancing  Fourier-domain  inversion  and  wavelet-domain  estimation  within  a 
wavelet-based  deconvolution  approach. 

5.6  Optimality  of  WaRD 

We  emphasize  that  the  existing  wavelet-based  deconvolution  algorithms  of  [6,  7,  24] 
are  special  cases  of  WaRD1  with  a  =  0.  By  construction,  WaRD  includes  the  value 
a  =  0  in  the  search-space  for  the  optimal  a*  that  it  uses.  Hence  WaRD  enjoys  all 
the  good  properties  of  Donoho’s  wavelet-based  deconvolution  such  as  optimal  rate  of 
error  decay  for  dilation-homogeneous  operators. 

The  optimal  a *  is  never  exactly  zero  at  a  finite  resolution  N  (though  it  can  tend 
to  zero  with  increasing  TV).  Hence,  WaRD  will  actually  outperform  wavelet-based 
deconvolution  methods  described  in  [6,  7,  24]  in  MSE-terms  at  a  given  resolution. 

Techniques  based  on  conventional  wavelet-based  deconvolution  [7, 11]  described  in 
Section  4  are  in  general  not  applicable  when  %  is  not  invertible.  However,  thanks  to 
optimal  amount  of  regularization  WaRD  uses,  WaRD  give  excellent  estimates  even 
when  %  is  not  invertible. 


1For  comparisons  with  [6],  the  use  of  a  similar  adapted  wavelet  basis  would  be  required 
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Chapter  6 

WaRD  implementation 

Several  issues  crop  up  during  the  implementation  of  the  WaRD  algorithm  in  practice. 

6.1  Choice  of  a 2  and  |X(/)|2 

The  WaRD  algorithm  as  outlined  in  Section  5  assumes  knowledge  of  the  variance  a2 
of  additive  noise  7  and  the  Fourier  spectrum  \X(f)\2  of  the  input  signal.  However, 
these  are  typically  unknown  in  practice  and  hence  need  to  be  estimated.  The  variance 
of  7  can  be  reliably  estimated  from  the  observation  y  by  using  a  median  estimator 
in  the  finest  wavelet  scale [5].  The  choice  of  \X(f)\2  is  however  more  tricky.  Any 
crude  estimate  of  the  spectrum  of  \X(f)\2  typically  has  attenuated  high-frequency 
components.  Employing  such  an  estimate  during  regularized  inversion  step  in  WaRD 
would  eliminate  all  the  high  frequency  components  in  the  final  estimate  as  well.  So 
we  believe  that  the  use  of  prior  information  on  \X(f)\2  instead  of  an  estimate  is 
appropriate.  For  the  purpose  of  understanding  the  WaRD  algorithm  and  gauging  its 
potential,  we  have  assumed  knowledge  of  a2  and  \X(f)\2  in  our  experiments. 

6.2  Empirical  Estimation  of  a 

Condition  (5.5)  provides  the  necessary  condition  for  the  regularization  parameters 
used  during  regularized  inversion  to  optimally  balance  Fourier-domain  regularization 
and  wavelet-domain  estimation.  However,  this  criterion  cannot  be  used  in  practice 
because  the  expression  assumes  knowledge  of  the  distorted  wavelet  coefficients  of 
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(a)  (b) 

Figure  6.1  :  (a)  The  MSE  of  WaRD  estimates  is  plotted  against  the  regularization  param¬ 
eter  a  for  different  images.  Ideally,  we  would  like  to  choose  the  regularization  parameter 
that  results  in  the  lowest  MSE  (denoted  by  the  small  circles).  Note  that  the  MSE  does  not 
change  significantly  when  any  a  slightly  higher  the  optimal  value  is  chosen,  (b)  The  norm 
of  the  WaRD  estimate  obtained  at  different  regularization  parameters  is  plotted  against  a. 
Note  that  the  a  that  achieves  the  lowest  MSE  is  located  near  the  knee  point  of  the  curve. 
So  in  practice,  near-optimal  performance  can  be  achieved  by  choosing  any  a  near  the  knee 
point  of  the  norm  curve. 


the  unknown  original  signal.  Further,  the  derivation  for  the  optimal  regularization 
parameter  is  derived  assuming  ideal  thresholding  which  cannot  be  implemented  in 
practice.  In  practice,  the  optimal  regularization  parameter  is  also  influenced  by  the 
wavelet-domain  estimation  algorithm  used  in  the  WaRD  system. 

This  raises  the  necessity  to  resort  to  empirical  techniques  to  find  the  regularization 
parameters.  In  the  course  of  our  simulations,  we  observed  that  the  improvement 
gained  by  using  different  empirical  regularization  parameters  for  each  wavelet  scale  is 
almost  negligible.  Hence  we  focus  on  estimating  an  optimal  regularization  parameter 
that  is  common  for  all  wavelet  scales. 

Figure  6.1(a)  shows  the  plot  of  the  MSE  of  WaRD  system  against  Fourier-domain 
regularization  parameter  a  for  different  images.  We  observe  that  the  typical  MSE 
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decreases  very  rapidly  when  a  is  small  but  changes  gradually  thereafter.  Further  we 
can  also  infer  that  choosing  any  a  sufficiently  large  gives  near-optimal  MSE  perfor¬ 
mances.  The  robustness  of  the  WaRD  system  performance  to  small  changes  in  the 
regularization  parameter  about  the  optimal  value  makes  the  estimation  of  a  near- 
optimal  regularization  parameter  easy.  We  exploit  this  robustness  to  come  up  with  a 
simple  scheme  to  estimate  the  near-optimal  regularization  parameter. 

Figure  6.1(b)  shows  the  typical  plot  of  the  l2  norm  of  the  WaRD  estimate  against 
Fourier-domain  regularization  parameter  a  for  the  same  images.  We  observe  that 
similar  to  the  behavior  of  the  MSE,  the  WaRD  solution  norm  also  changes  very 
rapidly  when  a  is  small  but  the  change  is  gradual  thereafter.  The  similarity  in  the 
trend  observed  in  Figure  6.1(a)  and  (b)  is  expected  because  the  MSE,  WaRD  estimate 
norm  and  the  actual  signal  norm  are  related  to  each  other  via  the  triangle  inequality. 
Since  the  the  error  of  the  estimate  is  essentially  unchanged  when  the  norm  of  the 
solution  remains  practically  constant,  we  can  empirically  determine  the  regularization 
parameter  by  choosing  an  a  near  the  knee  point  in  the  plot  of  the  WaRD  solution 
norm  against  a.  Such  an  empirically  determined  a  gives  near-optimal  results. 

Our  experience  showed  that  due  to  the  robustness  of  the  WaRD  system  to  the 
choice  of  the  a,  even  such  a  search  can  be  avoided.  Simply  choosing  a  ~  0.2  gave 
satisfactory  results  for  a  wide  class  of  images. 

6.3  Choice  of  Wavelet-domain  Estimation  Scheme 

A  variety  of  wavelet-domain  estimation  techniques  can  be  used  in  practice.  The 
choice  of  a  good  wavelet-domain  estimation  scheme  influences  the  final  performance 
significantly.  Commonly  studied  wavelet-domain  shrinkage  schemes  include  hard- 
thresholding  [25],  soft-thresholding  [26]  and  a  variety  of  other  similar  techniques 
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based  on  non-linear  wavelet  domain  shrinkage  [27,28].  Recently,  the  wavelet-domain 
Wiener  shrink  estimation  algorithm  was  proposed  by  Ghael  et.  al.  [29]  and  analyzed 
further  by  Choi  [30].  It  was  observed  that  this  technique  outperforms  the  conven¬ 
tional  wavelet-domain  estimation  schemes  that  employ  hard  and  soft  thresholding  in 
the  MSE  sense.  Estimation  using  a  wavelet-domain  Wiener  shrinkage  consists  of  the 
following  steps:  First  obtain  a  rough  estimate  of  the  input  signal  using  conventional 
wavelet-domain  techniques.  Use  this  estimate  to  obtain  a  final  refined  estimate  by 
employing  a  Wiener  filter  on  each  wavelet  coefficient. 

Significant  improvements  can  also  be  obtained  by  using  a  redundant,  shift- 
invariant  DWT.  Wavelet-domain  estimation  schemes  that  use  the  DWT  are  not  shift- 
invariant,  shifts  of  y  will  result  in  different  estimates.  The  redundant,  shift-invariant 
DWT  yeilding  significantly  improved  estimation  [31,32]  by  simply  averaging  over  all 
possible  shifts  of  the  observation  y  at  no  significant  increase  in  the  overall  computa¬ 
tional  cost.  So  we  employ  a  redundant,  shift- invariant  DWT  with  a  wavelet-domain 
Wiener  shrinkage  to  estimate  the  input  signal. 
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Chapter  7 
Results 

We  illustrate  the  performance  of  the  WaRD  algorithm  using  simulations  in  1-d  and 
2-d. 

We  first  compare  the  WaRD  with  other  methods  for  the  1-d  deconvolution  problem 
presented  in  the  Introduction.  In  order  to  observe  the  behavior  of  each  method  for 
both  smooth  and  edgy  regions,  we  take  for  x  a  concatenation  of  Donoho’s  Blocks 
and  Heavisine  signals  [5]  (normalized  to  be  zero  mean  and  unit  energy).  We  employ 
Daubechies  length-4  wavelets  throughout.  Figure  1.2(a)  depicts  the  signal.  For  the 
system,  we  take  the  example  of  [6,  pp.  459] 

[  1,  I/I  e  [0,0.25] 

H(f)  =  \  (7.1) 

[  2-4|/|,  l/j  e  (0.25,0.5] 

with  DFT  frequencies  /  normalized  to  (—0.5,  0.5]  (see  Figure  1.2(b)).  The  corrupting 
noise  variance  was  a2  —  4  x  10~6.  Figure  1.2(c)  plots  the  blurred,  noisy  signal.  The 
inverse  filter  frequency  response  H amplifies  all  high  frequency  noise  components 
(see  Figure  1.2(d)). 

Use  of  a  pure  inverse  T  amplifies  all  the  high  frequency  noise  components  thereby 
giving  an  extremely  noisy  estimate  (see  Figure  1.2(e))  of  the  input  signal.  The  Wiener 
filter  estimate  (Figure  1.2(f))  was  implemented  using  Px(f)  =  \X(f)\2  in  (2.4).  The 
Wiener  filter  bases  its  deconvolution  on  the  signal-to-noise  ratio  at  each  frequency. 
The  Fourier-domain  that  is  exploited  by  the  Wiener  filter  is  not  well-suited  to  rep¬ 
resent  the  signal  containing  spatially  localized  features  such  as  edges.  Consequently, 
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the  Wiener  deconvolution  is  ineffective. 

WaD  also  fails  to  provide  a  satisfactory  solution  in  this  case.  Due  to  the  null  in 
the  frequency  response  of  the  system  at  77(0.5)  =  0,  all  the  wavelet  coefficients  at  the 
finest  scale  are  corrupted  by  infinite  variance  noise  after  inversion  using  H  x\  hence 
these  coefficients  are  lost  during  wavelet-domain  estimation. 

The  mirror  wavelet  basis  deconvolution  method  of  [6, 11]  adapts  a  wavelet  packet 
basis  to  the  colored  noise  in  the  inverted  data  X(f)  =  Y(f).  The  frequency 

splits  of  the  mirror  wavelet  basis  for  H  are  shown  with  dashed  lines  in  Figure  1.2(d). 
The  adapted  basis  aims  to  isolate  the  singularity  at  /  =  0.5.  The  signal  estimation 
step  was  implemented  by  hard-thresholding  the  coefficients  of  a  shift-invariant  mirror 
wavelet  basis.  This  algorithm  outperforms  the  methods  of  [7,  24]  and  the  standard 
Wiener  filter  in  this  case  (see  Figure  1.2(g))1. 

Figure  1.2(h)  plots  the  WaRD  estimate  obtained  using  a *  =  0.06.  Wavelet-domain 
Wiener  shrinkage  was  applied  to  the  coefficients  of  a  shift- invariant  DWT  for  the 
wavelet-domain  signal  estimation  stage.  The  crude  estimate  required  by  the  wavelet- 
domain  Wiener  shrinkage  was  obtained  by  using  shift-invariant  hard  thresholding 
using  Daubechies  length-6  wavelets.  WaRD  outperforms  the  other  algorithms  in 
terms  of  both  estimate’s  visual  quality  and  MSE. 

Next,  we  consider  image  restoration  using  WaRD.  The  input  x  is  the  256  x  256 
Lena  image  (normalized  to  zero  mean  and  unit  energy)  and  the  discrete-time  system 
response  h  is  a  2-d,  4  x  4-point  smoother  [111  1  ]T  [1  1  1  1],  Such  a  response  is 
commonly  used  as  a  model  for  blurring  due  to  a  square  scanning  aperture  such  as  in  a 

xEven  though  H(f)  is  not  invertible,  the  wavelet  packet  deconvolution  method  does  not  fail, 
because  the  location  and  the  order  of  the  zero  of  H  are  such  that  only  a  few,  high-frequency  signal 
wavelet  coefficients  are  obliterated  by  the  infinite  noise  amplification  at  /  =  0.5.  This  method  will 
fail  when  H(f)  has  zeros  of  arbitrary  location  and  order,  however. 
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CCD  camera  [2],  The  noise  variance  was  set  to  a2  =  9.6  x  10~7.  Figure  7.1  illustrates 
the  desired  x.  the  observed  y,  the  Wiener  filter  estimate  xX}  and  the  WaRD  estimate 
for  a*  =  0.2.  The  value  for  a  was  chosen  to  be  close  to  the  knee  point  in  the  graph 
shown  in  Figure  6.1(b).  The  methods  of  [6,  7]  are  not  applicable  in  this  situation,  due 
to  the  many  zeros  in  H(fx,fy).  (The  Wiener  filter  outperformed  Mallat’s  adapted 
wavelet  basis  in  this  case.)  The  WaRD  estimate  is  clearly  better  that  the  Wiener 
estimate  in  overall  visual  quality  and  MSE.  However,  we  point  out  that  the  lines  on 
the  hat  are  almost  lost  in  the  WaRD  estimate  in  comparision  to  the  Wiener  estimate. 

The  difference  in  the  quality  of  the  estimates  obtained  using  the  Wiener  filter 
and  WaRD  can  be  grasped  by  viewing  the  cross-sections  through  the  images.  Figure 
7.2(a),  (b),  (c),  and  (d)  shows  the  cross-sections  through  row  220  of  the  original 
image,  the  observed  image,  the  Wiener  estimate  and  the  WaRD  estimate  respectively. 
The  Wiener  estimate  cross-section  shown  in  Figure  7.2(d)  illustrates  the  failure  of 
the  Wiener  technique  to  adapt  to  the  smooth  regions  and  the  edges  in  the  image 
simultaneously.  This  lack  of  spatial-localization  reflects  as  ripples  in  the  Wiener 
estimate.  In  contrast,  Figure  7.2(d)  clearly  illustrates  the  spatial-adaptivity  of  WaRD. 
We  observe  the  smooth  regions  and  the  edges  are  simultaneously  preserved. 
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(a) 


(b) 


(c)  (d) 

Figure  7.1  :  (a)  Lena  image  x  (256  x  256).  (b)  Observed  image  y  (MSE=0.2473).  (c) 
Wiener  biter  estimate  (MSE=0.0498).  The  ripples  in  the  image  result  because  the  Fourier 
basis  used  by  the  Wiener  filter  have  support  over  the  entire  spatial  domain,  (d)  WaRD  with 
a  —  0.2  (MSE=0.0427).  In  contrast  to  the  Wiener  estimate,  the  smooth  regions  and  most 
edges  are  well  preserved  in  the  WaRD  estimate,  thanks  to  the  spatially-localized  wavelet 
basis  functions.  However,  some  faint  features  such  as  the  lines  on  the  hat  are  lost  during 
wavelet  domain  estimation. 
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(a) 


(b) 


(c) 


(d) 

Figure  7.2  :  (a)  Cross-section  of  original  image  x  (row  220  from  Figure  7.1a).  The  image 
cross-section  contains  smooth  regions  and  discontinuities,  (b)  Cross-section  of  the  blurred 
and  noisy  observed  image  y.  (c)  Cross-section  of  estimate  obtained  using  the  spatially  in¬ 
variant  Wiener  technique.  The  Wiener  deconvolution  estimate  exhibits  ringing  artifacts, 
(d)  Cross-section  of  the  hybrid  WaRD  estimate.  Controlled  Fourier-domain  regulariza¬ 
tion  ensures  that  residual  noise  can  be  effectively  tackled  by  subsequent  spatially-adaptive 
wavelet  domain  estimation.  WaRD  preserves  smooth  regions  and  edges  simultaneously  even 
when  the  blurring  system  H  is  non-invertible. 
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Chapter  8 
Conclusions 

In  this  thesis,  we  have  proposed  an  efficient  multi-scale  deconvolution  algorithm 
WaRD  that  optimally  combines  Fourier-domain  regularized  inversion  and  wavelet- 
domain  signal  estimation.  The  WaRD  can  be  potentially  applied  in  a  wide  variety  of 
applications  such  as  satellite  imagery,  seismic  deconvolution  and  channel  equalization 
to  obtain  enhanced  performances. 

For  spatially  varying  signals,  the  WaRD  outperforms  the  LTI  Wiener  filter  and 
other  wavelet-based  deconvolution  algorithms  in  terms  of  both  visual  quality  and  MSE 
performance.  Since  WaRD  subsumes  WaD,  WaRD  also  enjoys  optimal  rates  of  error 
decay  with  increasing  samples  for  convolution  operators  such  as  Radon  Transform.  In 
addition,  WaRD  also  improves  on  the  performance  of  the  WaD  at  any  fixed  resolution. 
Furthermore,  WaRD  continues  to  provide  a  good  estimate  of  the  original  signal  even 
in  the  presence  of  any  ill-conditioned  system.  The  computational  complexity  of  the 
WaRD  algorithm  is  just  0(iV  logg  N)  where  N  is  the  number  of  samples. 

Theoretical  analysis  of  the  ideal  WaRD  algorithm  reveals  that  the  optimal  regular¬ 
ization  parameter  at  each  wavelet  scale  is  determined  by  the  proportion  of  distorted 
input  signal  wavelet  coefficients  that  are  greater  than  the  variance  noise  colored  by 
regularized  inversion.  From  the  expression  for  the  optimal  regularization  parameter 
it  follows  that  for  finite  data  samples,  employing  inversion  without  regularization  in  a 
wavelet-based  deconvolution  system  is  never  optimal.  Further,  using  a  regularization 
parameter  a  =  1,  which  corresponds  to  employing  a  Wiener  deconvolution  filter  for 
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inversion,  is  also  sub-optimal.  In  essence,  the  optimal  regularization  parameter  is 
simultaneously  determined  by  the  frugality  of  the  wavelet  representation  of  the  input 
signal  and  the  Fourier-domain  structure  of  the  convolution  operator. 

However,  fortunately,  the  final  performance  is  observed  to  be  quite  insensitive  to 
changes  in  the  value  of  the  regularization  parameter  around  the  optimal  value.  So,  a 
near-optimal  regularization  parameter  can  be  obtained  from  the  norm  of  the  WaRD 
solution  for  different  regularization  parameters.  As  a  guide,  in  simulations  spanning 
many  real-world  images  and  convolution  systems,  a *  was  almost  always  lay  in  the 
range  [0.2,  0.3].  In  the  1-d  example  above,  the  optimal  a  turned  out  to  be  small 
(a*  =  0.06)  because  the  test  signal  was  represented  very  economically  in  the  wavelet 
domain.  However,  choosing  a  a*  in  the  range  [0.2,  0.3]  gave  near-optimal  results  as 
well. 

There  are  several  avenues  for  future  WaRD  related  research: 

1.  We  have  focused  on  scalar  processing  during  wavelet-domain  estimation.  How¬ 
ever,  there  exist  dependencies  between  the  wavelet  coefficients  that  can  be  ex¬ 
ploited  to  improve  the  performance  of  wavelet-based  deconvolution  systems. 
We  are  currently  working  on  combining  WaRD  concepts  with  hidden  Markov 
model  (HMM)  tree  based  wavelet-estimation  [33]  to  obtain  a  better  deconvolu¬ 
tion  system.  We  believe  that  exploiting  such  inter-dependencies  in  the  wavelet 
domian  will  help  preserve  edges  and  other  spatially  localized  phenomena  (such 
as  the  lines  on  lena's  hat  better. 

2.  One  interesting  twist  to  the  approach  to  wavelet-based  deconvolution  is  to  first 
exploit  the  wavelet-domain  to  estimate  x  ®  h  from  the  noisy  observation  y  and 
then  invert  the  convolution  operator.  This  technique,  called  Vaguellete- Wavelet 
Decomposition  (VWD)  has  been  studied  by  Silverman  and  Abramovich  recently 
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[34],  The  salient  point  of  such  a  technique  is  that  the  wavelet-domain  estimation 
has  to  deal  with  white  noise  instead  of  the  complicated  colored  noise.  However, 
in  such  a  case  the  reconstruction  basis  is  no  longer  a  signal-adapted  wavelet- 
basis,  but  a  hybrid  basis  that  is  not  spatially  localized.  Further,  this  technique 
is  again  not  applicable  when  the  H  is  not  invertible.  We  find  the  construction 
of  a  universally,  applicable,  hybrid  deconvolution  scheme  that  lies  in  between 
WVD  and  VWD  promising  and  challenging. 
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Appendix  A 


Deriving  the  Optimal  Regularization  Parameter  a 


The  MSE  of  the  estimate  obtained  by  using  WaRD  is  approximated  by  the  sum  of 
the  distortion  incurred  during  Fourier-domain  regularized  inversion  and  the  error  due 
to  ideal  hard  thresholding  of  the  wavelet  coefficients. 

n=N 

MSE(a)  =  £[l-fl„(/„)]2|X(/„)|2  +  ^  minflfljy2,  o](a))  ,  (A.l) 

n=l  j,k 

_  «2o-4  A(/n)|2 _ l  V  V  mint  1(9“  I2  a2('o'T^ 

E(|ff(/„)|2|X(/„)|2+a(r2)2  +  (EX  (I  i* I  •  A  ))J  • 

where  {0®k}  are  the  wavelet  coefficients  of  the  distorted  but  noise-free  signal 
Ra(fn)X (fn)  and  a'j(a )  is  the  variance  of  noise  ( ^(f")  )  r(/n)  at  wavelet  scale  j 
(refer  Eqn.  (5.2).  j  =  j0  denote  the  coarsest  scale  coefficients  and  j  =  J  denote  the 
finest  resolution  coefficients. 

By  grouping  the  distortion  terms  according  to  the  frequency  subbands  of  a  wavelet 
system  (refer  Figure  4.2),  we  have 

N  9  4l  17  /  c  M9  J  2?  —1 
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A.l  Different  a  for  Each  Wavelet  Scale 

MSE(o;)  in  A. 3  is  derived  assuming  that  the  the  same  regularization  parameter  a  for 
each  wavelet  scale.  An  interesting  generalization  is  to  have  different  regularization 
parameters  <x,  for  each  wavelet  scale  j  (refer  Figure  5.2).  Assuming  that  the  wavelets 
at  different  scales  have  no  overlapping  frequencies,  the  cost  function  can  be  written 
as  a  function  of  the  different  regularization  parameters  as 

„  .  „  ^a?rr4\X(  AV2 

MSE(o;Jq,  cTjq,  ■  •  • ,  otj— i) 


£  £ 


t,  (\H(fn)\2\X(fn)\2  +  aj(jiy 

+  £  ( £min(iC*i^aj-(aj))  I  (A-3) 


3=30  \n=2>~ 
J  (  2i 


j=jo  \k= 1 

Note  that  the  cost  function  MSE(o;)  is  separable  with  respect  to  the  different  regular¬ 
ization  parameters  aj.  We  attempt  to  find  the  optimal  regularization  parameters  at 
each  scale  by  partially  differentiating  the  cost  function  with  respect  to  <x,  and  setting 
the  derivative  to  zero. 


A. 2  Differentiating  Distortion  Terms 


We  denote  the  total  distortion  error  by 

J-l  2-7-1 

D (ajo,...,aj)  :=Y^ 
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Taking  the  partial  derivative  of  D(<x,0, . . . ,  aj )  with  respect  to  ap  we  have 
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(A.4) 


(A.5) 


Note  that  the  partial  derivative  of  the  distortion  is  zero  at  ap  =  0.  For  the  sake  of 
convenience,  we  define 

Aa4\H{fn)¥\X{fn)\4 
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Therefore, 


(D  (o!j0 .  .  .  •  ,  CTj)  )  —  Clip  C (ttp) 


(A.7) 


A. 3  Differentiating  Ideal  Thresholding  Terms 

The  error  due  to  ideal  thresholding  at  each  wavelet  scale  p  is  given  by 
2p 

Emind«i2^,2K))  =  «*)2+  °l > 
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rap(ap) 


iCi  >ffp(“p) 


f  /«)  «p)2  du#”  (A. 8) 

Jo 

+  2P  I1-  fQ  f(wpv)dwp*  \  ^(op),  (A-9) 


+  2M  1 


where 


2p 


(A.  10) 


acts  like  the  probability  density  of  the  magnitude  of  the  wavelet  coefficients  of  the 
distorted  but  noise- free  signal  at  scale  p.  f(wpp )  is  not  smooth  but  can  be  approxi¬ 
mated  arbitrarily  closely  by  some  smooth  density  f(wp).  Using  this  smooth  density 
f(wpp )  instead  of  f(wpp )  in  expression  A. 9, 


j^min(|0jJ|VJ(afc)) 


+  2*1 
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1  ~  J  ~f{wpp)  dw^\  o-J(ap),  (A.  11) 


Let  the  total  thresholding  error  be  denoted  by 


T(a*,  min(|0“i|2,  cr|(oy)) 


(A.  12) 
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Taking  the  partial  derivative  of  T (<x,0, . . . ,  ctj)  with  respect  to  ap  we  have 
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We  recollect  that  the  purpose  of  Fourier-domain  regularization  is  to  shrink  noise 
components  which  explode  during  inversion  of  ill-conditioned  /H\  the  residual  noise 
is  tackled  using  shrinkage  in  the  wavelet-domain.  Since  excessive  Fourier-domain 
regularized  inversion  distorts  the  signal  and  smears  the  spatially  varying  features 
such  as  edges,  we  aim  to  keep  the  amount  of  Fourier-domain  regularization  minimal. 

Fortunately,  the  rogue  noise  components  at  frequencies  where  \H(fn)  \  &  0,  which 
prevent  effective  wavelet-domain  signal  estimation,  are  greatly  attenuated  even  for 
small  values  of  ap  (refer  Eqn.  (2.7).  Once  these  rogue  noise  components  are  attenu¬ 
ated,  we  expect  wavelet-domain  signal  estimation  to  be  effective.  Hence,  it  is  justified 
to  expect  a  optimal  regularization  parameter  that  achieve  drastic  noise  attenuation 
but  minimal  distortion.  So,  we  also  expect  that  any  small  change  in  the  ap  about 
its  optimal  value  would  cause  much  larger  changes  in  amount  of  noise  attenuated 
than  the  amount  of  signal  distorted.  So  we  neglect  any  changes  in  distribution  of  the 


wavelet  coefficients  with  respect  to  ap  in  comparison  to  changes  in  the  variance  the 
noise  ap(ap).  This  assumption  simplifies  Eqn.  (A.  13)  to 


where  #  (\6^k\2  >  ap(ap))  is  the  number  of  wavelet  coefficients  at  scale  p  that  possess 
energy  greater  than  the  noise  variance  (Jp{ap)  in  scale  p. 
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The  noise  variance  at  scale  p  is  equal  to 

=  fMww,  2 

p)  t^mfn)\2\x(fnW  +  ^y 

„  2  V  (2~^)2  \H(fn)\2\X(fn)\4  , 

~  „^_1(l^(/r1)|2|^(/.)|2  +  ^2)2  ’ 


(A.15) 


where  the  approximation  in  Eqn.  (A.15)  arises  from  assuming  that  the  wavelets  at 


scale  p  are  perfectly  bandlimited.  The  derivative  of  a2  (ap)  with  respect  to  ap  is  given 


by 


d<Jp{ap ) 

dap 


d 

dap 


2P-1 


^  E 

.  n=2f~1 


(\H(fn)\2\X(fn)\2  +  apa2Y 


n2XY  — 2ct42~p  \H(fn)\2\X  (fn)\4 

n^-A\H(fn)\2\X(fn)\2  +  apa*y 

v-1  4a42~p  \H(fn)\2\X(fn)\4 

n^A\H(fn)\2\X(fn)\2  +  apa*y 

-2  -pC(ap) 


(A.16) 


Combining  Eqn.  (A.  14)  and  Eqn.  (A.16)  we  have 

JL(T(aj0,...,aj))  =  -i#(| >  ^2M  (C(aP))  (A.  17) 

The  partial  derivative  of  MSE(ajo,  ctj0, . . . ,  olj-i)  with  respect  to  ap  is 

X  (MSE(ajo,a;,0,...,aJ_i))  ss  X  (DlVr,,. . . . +  T(Oj„. ....  o./jj 

(A.  18) 


Using  (A. 7)  and  (A. 17), 

(9  1 

—  (mSE(q;-o,  <x,0, . . . ,  aj-x))  «  C'(ap)  -  —  #  >  ^(«P))  C(ap) 

=  -  7^#  (iCfcl  >  aJK)))  CK)  (A-19) 

Setting  Eqn.  (A.  19)  to  zero,  we  get  the  optimal  regularization  parameter 

ap  ~  (iCfcl  >  AK)) 


(A. 20) 
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Appendix  B 

Error  Decay  Rates  for  LTI  filters 

Let  the  input  x  be  a  piece-wise  polynomial  signal  as  shown  in  Figure  B.l.  Since  x 
is  discontinuous,  its  energy  decays  as  0( j)  in  the  frequency  domain.  Consider  N 
equally  spaced  samples  of  the  signal.  In  the  Fourier  domain,  the  energy  change  in 
the  first  N  Fourier  coefficients  due  to  aliasing  is  less  than  a  multiplicative  factor  of 
two,  hence  we  neglect  the  effects  of  aliasing  and  approximate  the  the  magnitude  of 
the  DFT  coefficients  of  the  sampled  signal  x  as 

JN  N 

\X(fn)  |«—  n=  1 . —  (B.l) 

Ti  £ 


Figure  B.l  :  The  signal  shown  in  the  figure  is  piece-wise  polynomial.  The  polynomial 
“piece”  on  the  left  of  the  discontinuity  is  of  degree  4  and  the  polyomial  “piece”  on  the 
right  is  of  degree  8.  The  Fourier  coefficients  of  such  a  signal  decay  very  slowly  due  to  the 
presence  of  the  discontinuity  in  the  signal.  However,  wavelets  with  8  vanishing  moments 
represent  such  a  piece-wise  polynomial  signal  very  economically  using  just  0(log2  N)  non¬ 
zero  coefficients. 
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where  n  =  1 . . .  y  indexes  the  discrete  Fourier  coefficients  of  the  positive  frequencies. 
The  total  energy  Exs  of  the  signal  x  is  approximately  given  by 


Es 


N 


(for  large  N) 


(B-2) 


The  variance  of  additive  white  Gaussian  noise  is  a2  at  each  time  sample,  so  the 
variance  T(/n)  at  all  discrete  Fourier  components  is  also  a2.  The  total  noise  energy 
is  E1  is  No2.  Note  that  the  ratio  of  the  average  signal  energy  per  sample  and  the 
average  noise  energy  per  sample  are  kept  constant  with  increasing  resolution  N . 

Consider  any  low-pass  convolution  operator  %  that  decays  polynomially  in  the 
Fourier  domain  (eg.  Radon  transform).  The  Fourier  coefficients  of  such  a  convolution 
operator  can  be  written  as 

n 

\H(fn)\  ~  —  n  =  1 ...  y,  for  some  v  >  0 

Tl 

(B-3) 


Since  the  constant  C  turns  out  to  be  inconsequential  asymptotically,  we  choose  to 
ignore  it. 


n 


n 


1  E 

J-  .  .  .  2 


(B-4) 


The  variance  of  noise  colored  by  the  inversion  of  %  at  each  frequency  fn  is  given 


by  a2  ti2”,  for  n  =  1 The  best  LTI  estimate  of  the  input  signal  is  given  by 


XhTliJn) 


1 


\H(fn)\2X(fn)\' 


H {fn)  \H(fn)\2\X(fn)\2  +  o: 


X(fnW 


\x  {fn)\2  +  <72  n21' 


X {fn)  + 


;Y{fn) 

r  (fn) 
E  {fn) 


(B.5) 
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The  average  MSE  of  such  an  optimal  LTI  estimate  is 


(  4|A(/n)|2a2 n2l/ 


(B.6) 


N  \  \X(fn)\2  +  a2  n2u  J 

where  £  {.}  denotes  expectation  with  respect  to  the  noise  statistics.  It  is  easily  shown 
that  that 


+  a 1  n 


2  r,2  v 


o  •  (w<t\\2  2  2^  /  (  4  \X{fn)\2<72n2^ 

2  mm  [}X(fn)\  ,  o  n  )  <  ^ 

Hence, 

N/2 


<  4  min  (\X (fn)\2 , a2 n2 (B.7) 


2 

N 


N/2 


X/nin  (  X{Jn)f,a2  ri21')  <  ^£  { |XLr/ -  X|2}  <  J^min  (\X(fn)\2,a2  n2u) 

n= 1  n= 1 

(B.8) 

The  colored  noise  variance  at  frequency  fn  increases  monotonically  with  increasing  n. 
In  contrast,  the  signal  energy  decreases  monotonically  with  n.  Let  fq  denote  the  DFT 
frequency  where  the  signal  coefficient  energy  is  approximately  equal  to  the  colored 
noise  variance.  The  index  q  is  given  by 


Q 
1 2 


(iVcr2) 21,1+2  ,q  e  y}- 


(B.9) 


\X(fn)\2  <  a2  n21' 
and  \X(fn)\2  >  a2  ri2v 


for  all  n  >  q 
for  all  n  <  q 


Hence’^SZ  mm(\X(fn)\2,a2n2u)  =  \x( /> 


n=  1 


^n=l 


n=i 7+1 


2  q  2 

>  —  a2  + 

“IV  IV 


2  ™  JV 


E 


n2 


2 q 
N 


a2  +  2 


2  v+3 

(a2)2l/+ 2 

“27+1 

_/V  2  "+2 


+ 


n=g+l 

6  IV 
2 


-  2 


7T2  1 


IV  a5 


6  q 

1 

2  z/  +  2  4 


IV 


=  o 


iV(2-+2) 


(B.10) 
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Thus,  the  rate  at  which  the  average  error  per  sample  decays  for  the  best  LTI  estimator 
is  O  ( N  2^+2 ) . 
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Appendix  C 

Error  Decay  for  Wavelet-based  Techniques 

Again,  we  consider  a  piece- wise  polynomial  function  input  signal  (refer  Figure  B.l). 
Consider  a  wavelet  system  such  that  the  number  of  zero  moments  of  the  wavelet 
basis  functions  is  greater  than  or  equal  to  the  maximal  degree  of  of  the  polynomial 
pieces  in  the  input  signal.  Such  a  wavelet  system  represent  piece-wise  polynomial 
functions  very  economically.  All  the  wavelet  coefficients  corresponding  to  the  wavelet 
basis  functions  that  are  fully  contained  in  the  smooth  polynomial  region  of  the  signal 
are  zero  [17].  The  wavelet  coefficients  are  non-zero  only  if  supports  of  the  corre¬ 
sponding  wavelet  basis  function  contains  any  point  of  discontinuity.  Since  there  are 
log2  N  scales  in  wavelet  decomposition,  the  number  of  non-zero  wavelet  coefficients  is 
0(log2lV).  For  the  sake  of  simplicity,  we  assume  that  there  are  exactly  log 2  IV  non¬ 
zero  coefficients.  The  energy  of  the  non-zero  wavelet  coefficients  whose  basis  support 
contains  the  discontinuous  event  decay  as  2~p  with  increasing  wavelet  scales  p  [14]. 
So  the  wavelet  coefficient  at  scale  p  is  given  by 

\wP,k\2  ~  2~p  \w0,k\2  for  p  =  1, ... .  log2  N  (C.l) 

The  total  signal  energy  is  given  by  (2  —  A)  |u>0;/c|2.  With  denser  sampling,  the  energy 
of  the  signal  increases  proportional  to  N.  So  the  energy  of  the  coarsest  scale  |u>o,/cj2 
wavelet  coefficient  increases  as  N  C,  where  C  is  some  non-zero  constant. 

\wp^\2  ~  NC  forp  =  l, - log2iV  (C.2) 


When  the  wavelets  have  sufficient  zero  moments,  they  are  well-localized  in  fre- 
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quency  and  nearly  band-limited.  Then,  the  variance  of  the  colored  noise  at  wavelet 
scale  p  after  inversion  with  H is  approximately  given  by 


Note  that  the  signal  energy  per  sample  and  the  noise  energy  per  sample  stays  constant. 
We  consider  the  error  that  results  from  employing  ideal  scale-dependent  thresholding 
of  the  wavelet  coefficients,  where  ideal  thresholding  is  defined  in  5.3.  The  average  error 
per  sample  for  an  ideal  thresholding  estimator  is  given  by  JT  k  min(|roj!fc|2,  oj).  The 
colored  noise  variance  monotonically  increases  with  increasing  scale  p.  In  contrast, 
the  signal  energy  decays  with  scale  p.  Let  q  denote  the  scale  where  the  signal  wavelet 
coefficient  energy  is  approximately  equal  to  the  colored  noise  variance.  Then, 

(C  N  \ 

9  *  log2  )  (C.4) 
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which  is  clearly  faster  that  the  rate  achieved  by  LTI  estimators  asymptotically  with 
increasing  N. 
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